Scalable Preference Learning from Data Streams
Authors
Abstract
We study the task of learning the preferences of online readers of news, based on their past choices. Previous work has shown that it is possible to model this situation as a competition between articles, where the most appealing articles of the day are those selected by the most users. The appeal of an article can be computed from its textual content, and the evaluation function can be learned from training data. In this paper, we show how this task can benefit from an efficient algorithm, based on hashing representations, which enables it to be deployed on high-intensity data streams. We demonstrate the effectiveness of this approach on four real-world news streams, compare it with standard approaches, and describe a new online demonstration based on this technology.
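The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a hashing representation might be combined with learning from article "competitions". It assumes signed feature hashing of bag-of-words text and a perceptron-style pairwise update; the names `hashed_features`, `score`, `update` and the constant `DIM` are hypothetical and are not taken from the paper.

```python
import zlib
from collections import defaultdict

DIM = 2 ** 18  # assumed size of the hashed feature space (not from the paper)


def hashed_features(text, dim=DIM):
    """Map a text to a sparse {index: value} vector via signed feature hashing."""
    vec = defaultdict(float)
    for token in text.lower().split():
        h = zlib.crc32(token.encode("utf-8"))  # deterministic 32-bit hash
        index = h % dim                        # bucket for this token
        sign = 1.0 if (h >> 31) & 1 else -1.0  # top bit decides the sign
        vec[index] += sign
    return dict(vec)


def score(weights, features):
    """Linear appeal score: dot product of sparse weights and features."""
    return sum(weights.get(i, 0.0) * v for i, v in features.items())


def update(weights, preferred, other, lr=1.0):
    """Perceptron-style update on one (preferred, other) pair from the stream.

    Returns True if the weights were changed, i.e. the current model did not
    rank the preferred article strictly above the other one.
    """
    fp, fo = hashed_features(preferred), hashed_features(other)
    if score(weights, fp) - score(weights, fo) <= 0.0:
        for i, v in fp.items():
            weights[i] = weights.get(i, 0.0) + lr * v
        for i, v in fo.items():
            weights[i] = weights.get(i, 0.0) - lr * v
        return True
    return False


if __name__ == "__main__":
    w = {}
    popular = "election results shock markets"
    ignored = "local council meeting minutes"
    update(w, popular, ignored)
    print(score(w, hashed_features(popular)) > score(w, hashed_features(ignored)))
```

Because the weight vector has a fixed number of buckets regardless of vocabulary size, memory stays bounded as new words arrive in the stream, which is the usual motivation for hashing in this setting.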
Similar resources
TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model
Artificial Immune System (AIS) models hold many promises in the field of unsupervised learning. However, existing models are not scalable, which makes them of limited use in data mining. We propose a new AIS based clustering approach (TECNO-STREAMS) that addresses the weaknesses of current AIS models. Compared to existing AIS based techniques, our approach exhibits superior learning abilities, ...
Scalable e-Learning Multimedia Adaptation Architecture
A neglected challenge in existing e-Learning (eL) systems is providing access to multimedia to all users regardless of environmental conditions such as diverse device capabilities, the heterogeneity of the underlying IP network, and user modality preference. This paper proposes a novel two-tier transcoding framework capable of adapting eL multimedia to meet the end-user environmental challenges...
Mining Low Dimensionality Data Streams of Continuous Attributes
This paper presents an incremental and scalable learning algorithm in order to mine numeric, low dimensionality, high-cardinality, time-changing data streams. Within the Supervised Learning field, our approach, named SCALLOP, provides a set of decision rules whose size is very near to the number of concepts to be extracted. Experimental results with synthetic databases of different complexity d...
Collaborative Context-aware Preference Learning
Preference learning methods work by exploiting patterns in the data that relate users to items. Preference data often includes information such as the context of a recommendation (e.g. time/date, location). Leveraging this data (e.g. click logs, purchase/usage data) can significantly improve the relevance and quality of the recommendation. In this work we introduce a novel scalable context-awar...
Jubatus: An Open Source Platform for Distributed Online Machine Learning
Distributed computing is essential for handling very large datasets. Online learning is also promising for learning from rapid data streams. However, it is still an unresolved problem how to combine them for scalable learning and prediction on big data streams. We propose a general computational framework called loose model sharing for online and distributed machine learning. The key is to shar...
Publication date: 2015